Conversation


@zalza13 zalza13 commented Sep 26, 2025

Summary

This PR introduces the first round of improvements for the new in-memory cache in LiteDB 5.
Goal: make the cache more predictable, configurable, and robust under heavy concurrent load, while keeping all tests stable.

Key changes

  1. Volatile on scalar tuning fields

    • Marked scalar knobs like _maxFreePages, _cleanupBatchSize, _extends as volatile to improve visibility across threads.
    • Time-based fields (_idleBeforeEvict, _minCleanupInterval) remain as TimeSpan (structs cannot be volatile).
  2. Safer on-demand cleanup triggers

    • Removed cleanup triggers from NewPage() and GetWritablePage() to avoid interfering with segment extension and UniqueID sequencing (fixes the failing Cache_UniqueIDNumbering test).
    • Cleanup still runs in the right places:
      • GetReadablePage() (hot read path),
      • DiscardPage() (when pages are freed),
      • Clear() (manual cleanup),
      • and when the free-list exceeds _maxFreePages (forced).
    • Bounded, non-reentrant cleanup (SemaphoreSlim) with batched eviction and idle policy.
  3. Accurate comments & readability

    • Updated Cleanup() docs to reference _cleanupBatchSize and _idleBeforeEvict (instead of stale constants).
    • Light editing pass over comments for plain English (B1/B2 level); clarified eviction/cleanup intent.
  4. Profiles groundwork

    • Added CacheProfile { Mobile, Desktop, Server } and ApplyProfile(profile); a sketch follows this list.
    • Current defaults equal the Desktop profile.
    • Future work: pass the profile through an engine parameter (// TODO marker left in code).
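To make items 1 and 4 concrete, here is a minimal sketch of how the volatile knobs and ApplyProfile could fit together. The field and member names follow this description, but the method body is illustrative rather than the PR's actual code, and only the Desktop numbers match the stated defaults (the Mobile/Server values are placeholders):

using System;

public enum CacheProfile { Mobile, Desktop, Server }

internal partial class MemoryCache
{
    // scalar knobs are volatile: a write from ApplyProfile becomes
    // visible to reader threads without any extra locking
    private volatile int _maxFreePages = 200;
    private volatile int _cleanupBatchSize = 128;

    // TimeSpan is a struct and cannot be declared volatile
    private TimeSpan _idleBeforeEvict = TimeSpan.FromSeconds(60);
    private TimeSpan _minCleanupInterval = TimeSpan.FromSeconds(30);

    public void ApplyProfile(CacheProfile profile)
    {
        switch (profile)
        {
            case CacheProfile.Mobile:   // hypothetical values
                _maxFreePages = 50;   _cleanupBatchSize = 32;  break;
            case CacheProfile.Desktop:  // equals the current defaults
                _maxFreePages = 200;  _cleanupBatchSize = 128; break;
            case CacheProfile.Server:   // hypothetical values
                _maxFreePages = 1000; _cleanupBatchSize = 512; break;
        }
        // TODO: pass the profile through an engine parameter (Stage 2)
    }
}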

Motivation

LiteDB is often used in embedded scenarios with high concurrency and limited resources.
These changes make cache tuning safer and clearer without adding locking overhead or altering public APIs.

Impact

  • No breaking API changes
  • No behavioral regressions: all tests pass with updated cleanup logic
  • Performance: negligible overhead for volatile scalars; improved stability for live-tuned reads

Related

Next steps

  • Wire CacheProfile selection via engine parameter (Stage 2).
  • Add targeted stress tests for eviction/idle behavior under extreme concurrency.

Checklist

  • Target branch: dev-staging
  • Builds and tests pass locally
  • No public API changes
  • Documentation/comments updated
  • Style consistent with project guidelines

Notes for reviewers

  • This work was iterated with assistance from ChatGPT and Amazon Q for code review and validation ideas.
  • The cache was tested under heavy stress scenarios (high concurrency + forced cleanup) and behaved as expected without leaks or corruption.

- Add non-reentrant Cleanup() (SemaphoreSlim.Wait(0)) to prevent overlap
- Introduce TryCleanupOnDemand(): cleans on demand (no background thread)
  - Triggered from DiscardPage (force when free > cap)
  - Triggered after Extend() requeues pages (signal + on next tick)
  - Lightweight tick every OPS_CLEANUP_STEP ops (default 256)
- Eviction policy: idle + batch (DEFAULT_IDLE=60s, DEFAULT_BATCH=128)
- Stamp freed timestamp (UTC ticks) when moving readable -> free
  - DiscardPage/Extend/Clear now set page.Timestamp
- PageBuffer: atomic eviction guard (TryBeginEvict/MarkReusable) to avoid races (sketched below)
- Clear(): requeue readable pages, return count, wake cleanup if over cap
- Dispose(): block new triggers, wait any running cleanup, dispose lock
- Conservative defaults: MaxFreePages=200, Idle=60s, Batch=128, Interval=30s
- No public API changes

Refs: litedb-org#1756, litedb-org#2619, RFC litedb-org#2647
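To illustrate the two mechanisms named above, here is a sketch of the non-reentrant cleanup pass and the atomic eviction guard. It assumes _free is a ConcurrentQueue<PageBuffer>, compresses the real logic considerably (the _minCleanupInterval throttle and the force path are omitted), and _evictState is a hypothetical field name:

using System;
using System.Threading;
using System.Collections.Concurrent;

internal class PageBuffer
{
    public long Timestamp;   // UTC ticks, stamped when the page is freed
    private int _evictState; // 0 = reusable, 1 = being evicted (hypothetical)

    // CompareExchange makes eviction and reuse mutually exclusive
    public bool TryBeginEvict() => Interlocked.CompareExchange(ref _evictState, 1, 0) == 0;
    public void MarkReusable() => Volatile.Write(ref _evictState, 0);
}

internal partial class MemoryCache
{
    private readonly ConcurrentQueue<PageBuffer> _free = new ConcurrentQueue<PageBuffer>();
    private readonly SemaphoreSlim _cleanupLock = new SemaphoreSlim(1, 1);

    private void TryCleanupOnDemand()
    {
        // Wait(0) returns false at once if a cleanup is already running,
        // so callers never block and passes never overlap
        if (!_cleanupLock.Wait(0)) return;

        try
        {
            var now = DateTime.UtcNow.Ticks;
            var inspect = _free.Count; // snapshot bounds this pass
            var evicted = 0;

            while (inspect-- > 0 && evicted < _cleanupBatchSize
                && _free.TryDequeue(out var page))
            {
                // idle policy: release only pages freed long enough ago,
                // and only if the atomic guard wins the race
                if (now - page.Timestamp >= _idleBeforeEvict.Ticks && page.TryBeginEvict())
                    evicted++;           // buffer returned to the pool here
                else
                    _free.Enqueue(page); // still warm or contended: keep it
            }
        }
        finally
        {
            _cleanupLock.Release();
        }
    }
}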
@JKamsker
Collaborator

@codex review

@zalza13 zalza13 changed the title fix(engine/memorycache): harden MemoryCache with on-demand cleanup Engine/MemoryCache: Stage 1 improvements (RFC #2647) Sep 26, 2025
@chatgpt-codex-connector

Codex Review: Didn't find any major issues. You're on a roll.


@JKamsker JKamsker changed the base branch from dev-staging to dev September 27, 2025 23:26
Collaborator

@JKamsker JKamsker left a comment


Just a few here - out of time - more later

{
    switch (profile)
    {
        case CacheProfile.Mobile:

I'd like a more "dynamic" way:

  • CacheProfile becomes a struct or class
  • CacheProfile has 3 static factory properties (or static readonly fields, if the instance fields are readonly)
  • CacheProfile can be configured with any of the 4 parameters mentioned (with validation checks), as sketched below
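A sketch of that shape, under the assumption that the four tunables are MaxFreePages, CleanupBatchSize, IdleBeforeEvict, and MinCleanupInterval (the Desktop numbers match the commit-message defaults; Mobile and Server are placeholders):

using System;

public sealed class CacheProfile
{
    public int MaxFreePages { get; }
    public int CleanupBatchSize { get; }
    public TimeSpan IdleBeforeEvict { get; }
    public TimeSpan MinCleanupInterval { get; }

    public CacheProfile(int maxFreePages, int cleanupBatchSize,
                        TimeSpan idleBeforeEvict, TimeSpan minCleanupInterval)
    {
        // the requested checks: reject values that would disable the cache
        if (maxFreePages <= 0) throw new ArgumentOutOfRangeException(nameof(maxFreePages));
        if (cleanupBatchSize <= 0) throw new ArgumentOutOfRangeException(nameof(cleanupBatchSize));
        if (idleBeforeEvict <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(idleBeforeEvict));
        if (minCleanupInterval < TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(minCleanupInterval));

        MaxFreePages = maxFreePages;
        CleanupBatchSize = cleanupBatchSize;
        IdleBeforeEvict = idleBeforeEvict;
        MinCleanupInterval = minCleanupInterval;
    }

    // static factories replace the enum members
    public static CacheProfile Mobile { get; } =
        new CacheProfile(50, 32, TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(30));

    public static CacheProfile Desktop { get; } =
        new CacheProfile(200, 128, TimeSpan.FromSeconds(60), TimeSpan.FromSeconds(30));

    public static CacheProfile Server { get; } =
        new CacheProfile(1000, 512, TimeSpan.FromSeconds(120), TimeSpan.FromSeconds(15));
}

A caller could then pass either a preset or a custom instance, e.g. new CacheProfile(500, 256, TimeSpan.FromSeconds(90), TimeSpan.FromSeconds(20)).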


counter++;
// Check if we exceed free pages limit
if (_free.Count > _maxFreePages)

_free.Count enumerates the whole collection. Try a cheaper alternative: a dedicated int counter maintained with Interlocked operations.
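A sketch of that suggestion (EnqueueFree/TryDequeueFree are hypothetical wrapper names; a plain int read via Volatile.Read also avoids the CS0420 warning you would get passing a volatile field by ref to Interlocked):

using System.Threading;
using System.Collections.Concurrent;

internal partial class MemoryCache
{
    private int _freeCount; // mirrors _free.Count, maintained with Interlocked

    private void EnqueueFree(PageBuffer page)
    {
        _free.Enqueue(page);
        Interlocked.Increment(ref _freeCount);
    }

    private bool TryDequeueFree(out PageBuffer page)
    {
        if (!_free.TryDequeue(out page)) return false;
        Interlocked.Decrement(ref _freeCount);
        return true;
    }

    // the hot-path check becomes a single O(1) volatile read
    private bool FreeListOverCap() => Volatile.Read(ref _freeCount) > _maxFreePages;
}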

@JKamsker
Collaborator

JKamsker commented Oct 4, 2025

Just the one I could confirm in the timeframe I have.
Otherwise, here is a full review from ChatGPT: https://gist.github.com/JKamsker/395ddb25cd24f7447216f82d4672a42b
I am also going through each of its points to validate them; maybe you want to get ahead of me and review its review as well. I think it's quite solid.

For the caching mechanism, something like this looks promising: "Killing O(n): How Timing Wheels Expire 10 Million Keys Effortlessly in Golang" (Medium): put cache items into timed buckets for a future worker to clean up.
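The linked article is about Go, but the structure ports directly to C#. A minimal single-level wheel, as a sketch only (slot count, tick source, and how it would hook into MemoryCache are all open questions here):

using System;
using System.Collections.Concurrent;
using System.Threading;

internal sealed class TimingWheel<T>
{
    private readonly ConcurrentQueue<T>[] _buckets;
    private int _cursor; // advances one slot per tick

    public TimingWheel(int slots = 64)
    {
        _buckets = new ConcurrentQueue<T>[slots];
        for (var i = 0; i < slots; i++) _buckets[i] = new ConcurrentQueue<T>();
    }

    // place an item 'delaySlots' ticks into the future
    public void Schedule(T item, int delaySlots)
    {
        if (delaySlots >= _buckets.Length)
            delaySlots = _buckets.Length - 1; // clamp; a real wheel tracks rounds

        var slot = (Volatile.Read(ref _cursor) + delaySlots) % _buckets.Length;
        _buckets[slot].Enqueue(item);
    }

    // advance one slot and drain only that bucket, so each tick
    // costs O(items due now) rather than O(total items)
    public void Tick(Action<T> onDue)
    {
        var slot = Interlocked.Increment(ref _cursor) % _buckets.Length;
        while (_buckets[slot].TryDequeue(out var item))
            onDue(item);
    }
}

The tick could piggyback on the existing OPS_CLEANUP_STEP counter instead of a background thread, which would preserve the PR's no-background-thread property.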

@zalza13
Author

zalza13 commented Oct 31, 2025

Hi @JKamsker

I haven't had time to review the GPT feedback regarding this change :(

Just as a comment, I'd like to add that this version (which is possibly incomplete) is much better than the current version of LiteDB 5, which has a severe memory leak.

My idea was to submit three versions: this would be the first, then a second with improvements, and a last one with the final tuning, but as I said, that became impossible.

I'll do my best to resume this fix, which I believe is very necessary for LiteDB.

Thank you very much!
